
@Days-gone

Purpose

Verify that dummy-llama3.2-1B runs successfully on vllm-metax.

Test Plan

Run the test Python code after setting up the environment. The test inference script is located at docs/models/samples/dummy_llama_2.py.
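The contents of the sample script are not included in this description; the following is a hypothetical sketch of what docs/models/samples/dummy_llama_2.py might contain, assuming vLLM's standard offline-inference API (`LLM`, `SamplingParams`). The model name passed to `LLM` is an assumption, not taken from the PR.

```python
"""Hypothetical smoke-test sketch for dummy-llama3.2-1B on vllm-metax.

The real docs/models/samples/dummy_llama_2.py may differ; this only
illustrates the usual vLLM offline-inference flow.
"""


def build_prompts(names):
    # Build a few simple completion prompts for a smoke test.
    return [f"Hello, my name is {n} and I" for n in names]


def main():
    # Requires a working vllm-metax environment; imported lazily so the
    # helper above stays importable without vLLM installed.
    from vllm import LLM, SamplingParams

    sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)
    llm = LLM(model="dummy-llama3.2-1B")  # model name/path is an assumption
    for out in llm.generate(build_prompts(["Alice", "Bob"]), sampling):
        print(out.prompt, "->", out.outputs[0].text)


# Call main() on a machine with the vllm-metax environment set up.
```

On a correctly configured machine, calling `main()` should print one generated continuation per prompt, matching the kind of output captured in the Test Result screenshot.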

Test Result

Screenshot from 2025-11-14 18-35-17

Documentation Update

Updated docs/models/supported_models.md, adding the new model to the text-generation models table.
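The exact row added to docs/models/supported_models.md is not shown here; a plausible entry for the text-generation table, assuming a typical vLLM-style column layout (Architecture, Models, Example Models), might look like:

```markdown
| Architecture       | Models    | Example Models      |
|--------------------|-----------|---------------------|
| `LlamaForCausalLM` | Llama 3.2 | `dummy-llama3.2-1B` |
```

Both the column names and the architecture identifier are assumptions based on common vLLM documentation conventions, not on the actual diff.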

@Days-gone Days-gone changed the base branch from master to v0.11.0-dev on November 14, 2025 at 10:51

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support and verification for the dummy-llama3.2-1B model. The changes include significant updates to the installation and quickstart documentation, new sample scripts for dummy-llama3.2-1B and qwen3vl_2b, and updates to the supported models list. Additionally, a patch for rejection_sampler is introduced, likely for performance optimization, and a bug fix is applied in deepseek_v2.py. My review has identified a few high-severity issues in the new sample scripts related to missing dependencies in their documentation, and a critical issue in the new rejection_sampler.py patch concerning a potential division-by-zero error that could lead to incorrect sampling.
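The patched rejection_sampler.py is not shown in this conversation, so the following is only a hypothetical illustration of the division-by-zero class of bug the review flags, using a plain-Python probability normalization as a stand-in for the real sampler code.

```python
def normalize(weights, eps=1e-10):
    """Normalize non-negative weights into a probability distribution.

    Without a guard, an all-zero weight vector divides by zero and
    poisons downstream sampling with NaNs/inf, which is the kind of
    incorrect-sampling failure the review warns about.
    """
    total = sum(weights)
    if total <= 0.0:
        # Fall back to a uniform distribution instead of dividing by zero.
        n = len(weights)
        return [1.0 / n] * n
    # Clamp the denominator as a second line of defense against underflow.
    return [w / max(total, eps) for w in weights]
```

The uniform fallback is one reasonable choice for degenerate inputs; the actual patch may instead clamp probabilities, mask invalid rows, or raise an error.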
